Feature Extraction


FrustumFusionNets: A Three-Dimensional Object Detection Network Based on Tractor Road Scene

arXiv.org Artificial Intelligence

To address the existing frustum-based methods' underutilization of image information in road three-dimensional object detection, as well as the lack of research on agricultural scenes, we constructed an object detection dataset using an 80-line Light Detection And Ranging (LiDAR) sensor and a camera in a complex tractor road scene, and we propose a new network called FrustumFusionNets (FFNets). First, we use the results of image-based two-dimensional object detection to narrow down the search region in the three-dimensional point cloud space. Next, we introduce a Gaussian mask to enhance the point cloud information. Then, we extract features from the frustum point cloud and the cropped image using a point cloud feature extraction pipeline and an image feature extraction pipeline, respectively. Finally, we concatenate and fuse the features from both modalities to achieve three-dimensional object detection. Experiments on the constructed tractor road test set demonstrate that FrustumFusionNetv2 achieves 82.28% and 95.68% accuracy in three-dimensional detection of the two main road objects, cars and people, respectively, which is 1.83% and 2.33% higher than the original model. The method offers a hybrid-fusion-based, multi-object, high-precision, real-time three-dimensional object detection technique for unmanned agricultural machinery in tractor road scenarios. On the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) Benchmark Suite validation set, FrustumFusionNetv2 also shows a clear advantage in detecting road pedestrian objects compared with other frustum-based three-dimensional object detection methods.
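As a rough illustration of the frustum step described above, the sketch below projects LiDAR points into the image with an assumed 3x4 camera projection matrix, keeps the points whose projections fall inside a 2D detection box, and attaches a Gaussian weight centered on the box. This is a hedged sketch of the general frustum-plus-Gaussian-mask idea, not the authors' implementation; the function name, the `sigma_scale` parameter, and the assumption that points are already expressed in the camera frame are all illustrative.

```python
import numpy as np

def frustum_points_with_gaussian_mask(points, P, box, sigma_scale=0.5):
    """Select LiDAR points whose image projections fall inside a 2D box
    and attach a Gaussian weight centered on the box (illustrative sketch).

    points: (N, 3) LiDAR points, assumed already in the camera frame.
    P:      (3, 4) camera projection matrix.
    box:    (x1, y1, x2, y2) 2D detection box in pixels.
    """
    # Project 3D points to the image plane via homogeneous coordinates.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])   # (N, 4)
    uvw = pts_h @ P.T                                            # (N, 3)
    uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)           # (N, 2)

    x1, y1, x2, y2 = box
    in_box = (uv[:, 0] >= x1) & (uv[:, 0] <= x2) & \
             (uv[:, 1] >= y1) & (uv[:, 1] <= y2) & (uvw[:, 2] > 0)

    frustum_pts = points[in_box]
    frustum_uv = uv[in_box]

    # Gaussian mask: points projecting near the box center get higher weight.
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    sx, sy = sigma_scale * (x2 - x1), sigma_scale * (y2 - y1)
    weights = np.exp(-(((frustum_uv[:, 0] - cx) ** 2) / (2 * sx ** 2)
                       + ((frustum_uv[:, 1] - cy) ** 2) / (2 * sy ** 2)))
    return frustum_pts, weights
```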


MsaMIL-Net: An End-to-End Multi-Scale Aware Multiple Instance Learning Network for Efficient Whole Slide Image Classification

arXiv.org Artificial Intelligence

Bag-based Multiple Instance Learning (MIL) approaches have emerged as the mainstream methodology for Whole Slide Image (WSI) classification. However, most existing methods adopt a segmented training strategy, which first extracts features using a pre-trained feature extractor and then aggregates these features through MIL. This segmented training approach leads to insufficient collaborative optimization between the feature extraction network and the MIL network, preventing end-to-end joint optimization and thereby limiting the overall performance of the model. Additionally, conventional methods typically extract features from all patches of fixed size, ignoring the multi-scale observation characteristics of pathologists. This not only results in significant computational resource waste when tumor regions represent a minimal proportion (as in the Camelyon16 dataset) but may also lead the model to suboptimal solutions. To address these limitations, this paper proposes an end-to-end multi-scale WSI classification framework that integrates multi-scale feature extraction with multiple instance learning. Specifically, our approach includes: (1) a semantic feature filtering module to reduce interference from non-lesion areas; (2) a multi-scale feature extraction module to capture pathological information at different levels; and (3) a multi-scale fusion MIL module for global modeling and feature integration. Through an end-to-end training strategy, we simultaneously optimize both the feature extractor and MIL network, ensuring maximum compatibility between them. Experiments were conducted on three cross-center datasets (DigestPath2019, BCNB, and UBC-OCEAN). Results demonstrate that our proposed method outperforms existing state-of-the-art approaches in terms of both accuracy (ACC) and AUC metrics.
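To make the end-to-end idea concrete, here is a minimal attention-based MIL sketch in which a patch encoder and the bag-level aggregator sit in one computation graph, so a single backward pass updates both. It is an illustrative stand-in (attention pooling in the style of ABMIL), not the MsaMIL-Net modules; the encoder, feature dimension, and class count are assumptions.

```python
import torch
import torch.nn as nn

class EndToEndAttentionMIL(nn.Module):
    """Minimal end-to-end MIL sketch: a shared patch encoder feeds an
    attention pooling head, so both are optimized jointly (illustrative,
    not the MsaMIL-Net architecture)."""

    def __init__(self, encoder: nn.Module, feat_dim: int = 512, n_classes: int = 2):
        super().__init__()
        # `encoder` is assumed to map (num_patches, C, H, W) to (num_patches, feat_dim),
        # e.g. a truncated CNN followed by global pooling.
        self.encoder = encoder
        self.attn = nn.Sequential(nn.Linear(feat_dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (num_patches, C, H, W) -- one bag (one slide).
        feats = self.encoder(patches)                # (num_patches, feat_dim)
        a = torch.softmax(self.attn(feats), dim=0)   # (num_patches, 1) attention
        bag = (a * feats).sum(dim=0)                 # (feat_dim,) bag embedding
        return self.classifier(bag)                  # bag-level logits
```

Because the encoder lives in the same module as the aggregator, the bag-level loss backpropagates into feature extraction, which is exactly the joint optimization the segmented strategy lacks.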


One-Shot Affordance Grounding of Deformable Objects in Egocentric Organizing Scenes

arXiv.org Artificial Intelligence

Deformable object manipulation in robotics presents significant challenges due to uncertainties in component properties, diverse configurations, visual interference, and ambiguous prompts. These factors complicate both perception and control tasks. To address these challenges, we propose a novel method for One-Shot Affordance Grounding of Deformable Objects (OS-AGDO) in egocentric organizing scenes, enabling robots to recognize previously unseen deformable objects with varying colors and shapes using minimal samples. Specifically, we first introduce the Deformable Object Semantic Enhancement Module (DefoSEM), which enhances hierarchical understanding of the internal structure and improves the ability to accurately identify local features, even under conditions of weak component information. Next, we propose the ORB-Enhanced Keypoint Fusion Module (OEKFM), which optimizes feature extraction of key components by leveraging geometric constraints and improves adaptability to diversity and visual interference. Additionally, we propose an instance-conditional prompt based on image data and task context, which effectively mitigates the issue of region ambiguity caused by ambiguous prompt words. To validate these methods, we construct a diverse real-world dataset, AGDDO15, which includes 15 common types of deformable objects and their associated organizational actions. Experimental results demonstrate that our approach significantly outperforms state-of-the-art methods, achieving improvements of 6.2%, 3.2%, and 2.9% in KLD, SIM, and NSS metrics, respectively, while exhibiting high generalization performance. Source code and benchmark dataset will be publicly available at https://github.com/Dikay1/OS-AGDO.
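For readers unfamiliar with ORB, the snippet below shows one plausible way keypoints could feed a fusion module: detect ORB keypoints with OpenCV and sample a convolutional feature map at those locations. This is a hypothetical sketch, not the paper's OEKFM; the `stride` parameter and nearest-cell sampling are assumptions.

```python
import cv2
import numpy as np

def orb_keypoint_coords(image_bgr: np.ndarray, n_features: int = 200) -> np.ndarray:
    """Return (K, 2) pixel coordinates of ORB keypoints (illustrative only)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(nfeatures=n_features)
    keypoints = orb.detect(gray, None)
    return np.array([kp.pt for kp in keypoints], dtype=np.float32).reshape(-1, 2)

def pool_features_at_keypoints(feature_map: np.ndarray, coords: np.ndarray,
                               stride: int = 8) -> np.ndarray:
    """Sample an (H, W, C) feature map at keypoint locations (nearest cell)."""
    # Convert (x, y) pixel coordinates to (row, col) feature-map indices.
    ij = np.clip((coords[:, ::-1] / stride).astype(int),
                 0, np.array(feature_map.shape[:2]) - 1)
    return feature_map[ij[:, 0], ij[:, 1]]          # (K, C) sampled features
```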


Adaptive H&E-IHC information fusion staining framework based on feature extractors

arXiv.org Artificial Intelligence

Immunohistochemistry (IHC) staining plays a significant role in the evaluation of diseases such as breast cancer. H&E-to-IHC transformation based on generative models provides a simple and cost-effective way to obtain IHC images. Although previous models perform digital staining well, they still suffer from two problems: (i) coloring relies only on pixel features that are not prominent in H&E, which easily causes information loss during the coloring process; and (ii) the lack of pixel-perfect H&E-IHC ground-truth pairs poses a challenge to the classical L1 loss. To address these challenges, we propose an adaptive information-enhanced coloring framework based on feature extractors. We first propose the VMFE module, which effectively extracts color information features using multi-scale feature extraction and wavelet-transform convolution, combined with a shared decoder for feature fusion. The high-performance dual H&E-IHC feature extractor is trained by contrastive learning, which effectively aligns H&E and IHC features in a high-dimensional space. The trained feature encoder is then used to enhance features and adaptively adjust the loss during H&E section staining, addressing the problems of unclear and asymmetric information. We tested the framework on different datasets and achieved excellent performance. Our code is available at https://github.com/babyinsunshine/CEFF
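The contrastive training of the dual H&E-IHC feature extractor can be pictured with a standard symmetric InfoNCE objective over paired embeddings, as sketched below. This is a generic alignment loss assumed for illustration, not the paper's exact objective; the temperature value is arbitrary.

```python
import torch
import torch.nn.functional as F

def infonce_alignment_loss(he_feats: torch.Tensor, ihc_feats: torch.Tensor,
                           temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss aligning paired H&E / IHC feature vectors
    (a generic sketch of contrastive feature alignment, not the paper's
    exact objective). Both inputs are (B, D) batches of paired embeddings."""
    he = F.normalize(he_feats, dim=-1)
    ihc = F.normalize(ihc_feats, dim=-1)
    logits = he @ ihc.t() / temperature           # (B, B) similarity matrix
    targets = torch.arange(he.size(0), device=he.device)
    # Matching pairs sit on the diagonal; off-diagonal entries are negatives.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```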


Neurobiber: Fast and Interpretable Stylistic Feature Extraction

arXiv.org Artificial Intelligence

Linguistic style is pivotal for understanding how texts convey meaning and fulfill communicative purposes, yet extracting detailed stylistic features at scale remains challenging. We present Neurobiber, a transformer-based system for fast, interpretable style profiling built on Biber's Multidimensional Analysis (MDA). Neurobiber predicts 96 Biber-style features from our open-source BiberPlus library (a Python toolkit that computes stylistic features and provides integrated analytics, e.g., PCA and factor analysis). Despite being up to 56 times faster than existing open source systems, Neurobiber replicates classic MDA insights on the CORE corpus and achieves competitive performance on the PAN 2020 authorship verification task without extensive retraining. Its efficient and interpretable representations readily integrate into downstream NLP pipelines, facilitating large-scale stylometric research, forensic analysis, and real-time text monitoring. All components are made publicly available.
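A system of this kind can be pictured as a transformer encoder with a 96-way multi-label head, one sigmoid output per Biber-style feature. The sketch below is purely illustrative: the `roberta-base` backbone, the pooling choice, and the head are assumptions, not Neurobiber's published architecture or the BiberPlus API.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class StyleFeaturePredictor(nn.Module):
    """Sketch of a transformer with a 96-way multi-label head for
    Biber-style features (illustrative; the real Neurobiber model and
    the BiberPlus library may differ)."""

    def __init__(self, backbone: str = "roberta-base", n_features: int = 96):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(backbone)
        self.head = nn.Linear(self.encoder.config.hidden_size, n_features)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]      # first-token pooling
        return torch.sigmoid(self.head(pooled))   # per-feature probabilities
```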


Domain-Invariant Per-Frame Feature Extraction for Cross-Domain Imitation Learning with Visual Observations

arXiv.org Artificial Intelligence

Imitation learning (IL) enables agents to mimic expert behavior without reward signals but faces challenges in cross-domain scenarios with high-dimensional, noisy, and incomplete visual observations. To address this, we propose Domain-Invariant Per-Frame Feature Extraction for Imitation Learning (DIFF-IL), a novel IL method that extracts domain-invariant features from individual frames and adapts them into sequences to isolate and replicate expert behaviors. We also introduce a frame-wise time labeling technique to segment expert behaviors by timesteps and assign rewards aligned with temporal contexts, enhancing task performance. Experiments across diverse visual environments demonstrate the effectiveness of DIFF-IL in addressing complex visual tasks.
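The frame-wise time labeling idea can be illustrated with a small classifier that predicts which temporal bin of an expert episode a single frame's feature came from, as sketched below. The bin count, feature dimension, and head architecture are assumptions chosen for illustration; DIFF-IL's actual labeling and reward construction may differ.

```python
import torch
import torch.nn as nn

class FrameTimeLabeler(nn.Module):
    """Illustrative sketch of frame-wise time labeling: predict which of
    n_bins temporal segments a single frame's (domain-invariant) feature
    came from. Not the DIFF-IL implementation; shapes and losses are assumed."""

    def __init__(self, feat_dim: int = 256, n_bins: int = 10):
        super().__init__()
        self.n_bins = n_bins
        self.head = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                  nn.Linear(128, n_bins))

    def time_labels(self, episode_len: int) -> torch.Tensor:
        # Map each timestep t to a coarse time bin in [0, n_bins).
        t = torch.arange(episode_len)
        return (t * self.n_bins // episode_len).clamp(max=self.n_bins - 1)

    def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
        return self.head(frame_feats)             # (T, n_bins) bin logits
```

Training this head with cross-entropy on expert frames, and then scoring agent frames by how confidently they match their expected bin, is one way such temporal labels could be turned into a shaped reward (again, an assumption of this sketch rather than the paper's procedure).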


PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding (Supplementary Material) Jiarui Lu

Neural Information Processing Systems

The DDE protein sequence feature vector is defined by the statistical features of dipeptides, i.e., two consecutive amino acids in the protein sequence; for example, the feature of the dipeptide "st" is derived from its dipeptide composition. The Moran feature descriptor describes the distribution of amino acid properties along a protein sequence; the Moran feature vector has 8M dimensions, where M is the maximum-lag parameter (set to 30 following iFeature).
Table 1: Balanced metric (weighted F1) compared with accuracy on multi-class classification tasks. We report mean (std) for each experiment.
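As a concrete reference for the dipeptide statistics mentioned above, the sketch below computes the 400-dimensional dipeptide composition of a sequence, which is the statistic the DDE descriptor builds on (the full DDE additionally normalizes by a theoretical mean and variance derived from codon counts; that step is omitted here).

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs

def dipeptide_composition(sequence: str) -> dict:
    """Fraction of each of the 400 dipeptides among all consecutive pairs
    in a protein sequence (minimal sketch of the statistic behind DDE)."""
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    total = max(len(pairs), 1)
    counts = {dp: 0 for dp in DIPEPTIDES}
    for p in pairs:
        if p in counts:          # skip pairs containing non-standard residues
            counts[p] += 1
    return {dp: c / total for dp, c in counts.items()}

# Example: the "st" feature mentioned above is dipeptide_composition(seq)["ST"]
```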


RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder

Neural Information Processing Systems

Existing object detection frameworks are usually built on a single format of object/part representation, i.e., anchor/proposal rectangle boxes in RetinaNet and Faster R-CNN, center points in FCOS and RepPoints, and corner points in CornerNet. While these different representations usually drive the frameworks to perform well in different aspects, e.g., better classification or finer localization, it is in general difficult to combine them in a single framework to make good use of each strength, due to the heterogeneous or non-grid feature extraction required by different representations. This paper presents an attention-based decoder module, similar to that in the Transformer [31], to bridge other representations into a typical object detector built on a single representation format, in an end-to-end fashion. The other representations act as a set of key instances that strengthen the main query representation features in the vanilla detectors. Novel techniques are proposed for efficient computation of the decoder module, including a key sampling approach and a shared location embedding approach. The proposed module is named bridging visual representations (BVR). It can be applied in-place, and we demonstrate its broad effectiveness in bridging other representations into prevalent object detection frameworks, including RetinaNet, Faster R-CNN, FCOS, and ATSS, where improvements of about 1.5 to 3.0 AP are achieved. In particular, we improve a state-of-the-art framework with a strong backbone by about 2.0 AP, reaching 52.7 AP on COCO test-dev. The resulting network is named RelationNet++.
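The bridging mechanism can be pictured as a cross-attention layer in which the detector's main representation queries the auxiliary representations, as in the minimal sketch below. It shows only the core attention-and-residual step; the BVR module's key sampling and shared location embedding are not reproduced, and the feature dimensions are assumed.

```python
import torch
import torch.nn as nn

class BridgingAttention(nn.Module):
    """Sketch of the bridging idea: the detector's main representation
    (queries) attends over auxiliary representations such as corner or
    center points (keys/values) and adds the result back as an enhancement.
    Illustrative only, not the full BVR module."""

    def __init__(self, dim: int = 256, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, query_feats: torch.Tensor, key_feats: torch.Tensor) -> torch.Tensor:
        # query_feats: (B, Nq, dim) main representation (e.g. anchor features)
        # key_feats:   (B, Nk, dim) auxiliary representation (e.g. corner points)
        bridged, _ = self.attn(query_feats, key_feats, key_feats)
        return query_feats + bridged              # residual strengthening
```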



Reviews: Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA

Neural Information Processing Systems

There has been a lot of progress in taking the successes of supervised techniques and extending them to unsupervised domains by defining an interesting label to predict, and this paper continues this trend by highlighting the task of discriminating between time windows. Furthermore, I think the emphasis on the importance of identifiability for nonlinear ICA, and showing how it is solved in this case, is a timely contribution. However, these contributions were somewhat muted by other shortcomings of the approach, which make me doubt the robustness and generality of the method. This is a significant source of prior knowledge to include in the experiments and makes the comparisons unfair. In particular, the comment that "none of the hidden units seem to represent artefacts, in contrast to ICA" rings a bit hollow, since it seems that the elimination of artifacts was really achieved through hand-picked choices in the model.
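For context on the method under review, time-contrastive learning can be summarized as: split a multivariate time series into segments, label each sample by its segment index, train a nonlinear classifier, and use its hidden activations as features. The sketch below illustrates that recipe with a generic scikit-learn MLP; the segment count, hidden size, and single-hidden-layer choice are assumptions and do not reproduce the paper's setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def time_contrastive_features(signal: np.ndarray, n_segments: int = 20) -> np.ndarray:
    """Minimal sketch of time-contrastive learning on a (T, d) time series:
    label each sample by its segment index, train a classifier, and return
    its hidden-layer activations as learned features (illustrative only)."""
    T, d = signal.shape
    seg_len = T // n_segments
    X = signal[: seg_len * n_segments]
    y = np.repeat(np.arange(n_segments), seg_len)    # segment labels
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)
    clf.fit(X, y)
    # ReLU hidden-layer activations (matching the default MLP activation).
    return np.maximum(X @ clf.coefs_[0] + clf.intercepts_[0], 0.0)
```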